Abstract. The leafage of a digraph is the minimum number of leaves in a host
tree in which it has a subtree intersection representation. We discuss bounds on
the leafage in terms of other parameters (including Ferrers dimension), obtaining
a string of sharp inequalities.



1. INTRODUCTION 

An intersection representation of a digraph D assigns an ordered pair (S_v, T_v) to each
vertex v ∈ V(D) such that uv ∈ E(D) if and only if S_u ∩ T_v ≠ ∅. We call S_v and T_v the
source set and sink set of v. This model was first described by Beineke and Zamfirescu [1] 
under the name connection digraph. An essentially equivalent model in terms of bipartite 
graphs was introduced by Harary, Kabell, and McMorris [7]. 

When each set in an intersection representation is a subtree of a fixed host tree, we 
have a subtree representation. Every n-vertex digraph has a subtree representation in a 
star with n leaves. Not every digraph has a subtree representation in a path; those that 
do are the interval digraphs, which are characterized in [15,16]. We define the leafage
l(D) of a digraph D to be the minimum number of leaves in a host tree in which D has
a subtree representation. Thus leafage is a measure of distance from an interval digraph,
and the subtree representations in stars show that l(D) ≤ n(D). An analogous parameter
for chordal (undirected) graphs is studied in [8]. Further results about adjacency matrices
of interval digraphs appear in [9,10,15,16,17].

We obtain lower bounds on leafage using the idea of Ferrers dimension. The successors
of a vertex v are {u ∈ V(D): vu ∈ E(D)}; the predecessors are {u ∈ V(D): uv ∈ E(D)}.
A digraph is a Ferrers digraph [14] if its successor sets are linearly ordered by inclusion,
which is equivalent to the adjacency matrix A(D) having no 2 by 2 permutation submatrix.
Viewing a digraph as a relation D ⊆ V(D) × V(D), the Ferrers dimension f(D) of D is
the minimum number of Ferrers digraphs on V(D) whose intersection is D (introduced in
[2]). Since the complement in V × V of a Ferrers digraph is also a Ferrers digraph, this
also equals the minimum number of Ferrers digraphs whose union is the complement D̄ of D.
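
The two equivalent descriptions of a Ferrers digraph can be checked directly on a small 0,1-matrix. The following is a minimal sketch, assuming the adjacency matrix is given as a list of rows; the function names `is_ferrers` and `has_2x2_permutation` are hypothetical and chosen only for this illustration.

```python
from itertools import combinations


def is_ferrers(A):
    """True if the successor sets (rows of A) are linearly ordered by inclusion."""
    succ = [frozenset(j for j, a in enumerate(row) if a) for row in A]
    return all(s <= t or t <= s for s, t in combinations(succ, 2))


def has_2x2_permutation(A):
    """True if some two rows and two columns of A induce a 2 by 2 permutation matrix."""
    for i, k in combinations(range(len(A)), 2):
        for j, l in combinations(range(len(A[0])), 2):
            main = A[i][j] and A[k][l] and not A[i][l] and not A[k][j]
            anti = A[i][l] and A[k][j] and not A[i][j] and not A[k][l]
            if main or anti:
                return True
    return False
```

For a staircase matrix such as [[1,1,1],[1,1,0],[1,0,0]] the first test succeeds and the second fails; for the 2 by 2 identity matrix the outcomes are reversed.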

Interval digraphs all have Ferrers dimension at most 2; a digraph D is an interval
digraph if and only if its complement D̄ is the union of two disjoint Ferrers digraphs [15].
This generalizes to a lower bound on l(D) using Ferrers digraphs. Let f*(D) denote the
minimum number of pairwise disjoint Ferrers digraphs whose union is D̄. Their complements
are Ferrers digraphs whose intersection is D and whose pairwise unions are V(D) × V(D).
Having imposed an extra condition on the minimization, we have f*(D) ≥ f(D); we prove
that l(D) ≥ f*(D).

On the upper side, we study the related catch leafage l*(D) of a digraph D. This
is the minimum number of leaves in a host tree in which D has a subtree representation
such that each sink subtree is a single vertex. (Such representations, particularly when
the host tree is a path, are studied in [12,13,15].) This condition restricts the allowable
representations, so l*(D) ≥ l(D). We prove that l*(D) ≤ w(P(D)), where w(P(D)) is the
width of the inclusion poset P(D) on the sets whose incidence vectors are the columns of
the adjacency matrix A(D). We also give a sufficient condition for equality in this bound.

We thus obtain the chain of inequalities 

f(D) ≤ f*(D) ≤ l(D) ≤ l*(D) ≤ w(P(D)) ≤ n(D).

We present examples to show that each inequality is best possible. We also present
examples to show that each bound is arbitrarily weak, as any one of these parameters can be
at most 3 when the next parameter is arbitrarily large.

The upper bound w(P(D)) is easily computable, but the lower bounds are not. Cogis
[2] and Doignon, Ducamp, and Falmagne [4] proved an easily testable characterization of
the digraphs with Ferrers dimension at most 2, but Yannakakis [18] proved that recognition
of Ferrers dimension 3 is NP-complete. Müller [11] found a polynomial-time recognition
algorithm for interval digraphs (leafage 2). Other than this, we do not know the complexity
of recognizing digraphs with bounded values for any of {f*(D), l(D), l*(D)}.

2. SUBTREE REPRESENTATIONS AND LEAFAGE 

We use u → v to denote the successor relation; u → v means "uv is an edge". A branch
point of a tree is a vertex of degree at least 3. We show first that leafage is well-defined.

THEOREM 1. If D is a digraph with n vertices, then D has a subtree representation
in a star with at most n leaves.
Proof: In a star H with n leaves, assign distinct leaves as sink sets for the n vertices.
For each v ∈ V(D), let S_v be the star induced by the center of H and the leaves
corresponding to the successors of v. Then u → v if and only if T_v ⊆ S_u, and hence this is a
representation. ■
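
The construction in this proof is easy to check mechanically. The following is a minimal sketch, assuming vertices are numbered 0, ..., n − 1 and edges are given as a set of ordered pairs; the names `star_representation` and `represented_edges`, and the labels used for host vertices, are hypothetical.

```python
def star_representation(n, edges):
    """Build (sources, sinks) in a star whose leaves are ("leaf", v) and whose center is "c"."""
    center = "c"
    # T_v is the single leaf reserved for v.
    sinks = {v: {("leaf", v)} for v in range(n)}
    # S_u is the substar spanned by the center and the leaves of the successors of u.
    sources = {
        u: {center} | {("leaf", v) for v in range(n) if (u, v) in edges}
        for u in range(n)
    }
    return sources, sinks


def represented_edges(sources, sinks):
    """Recover the digraph: uv is an edge exactly when S_u meets T_v."""
    return {(u, v) for u in sources for v in sinks if sources[u] & sinks[v]}
```

Applying `represented_edges` to the output of `star_representation(n, edges)` returns `edges` again, which is the content of the proof. Every source set contains the center, so each is a connected substar.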

The bound l(D) ≤ n(D) is sharp, as it holds with equality for the digraph D_n in
Theorem 2. Our tool for proving lower bounds on l(D) is a property of subtrees of a tree.
If T_i, T_j, T_k are subtrees of a tree, then we say that T_k is between T_i and T_j if T_i ∩ T_j = ∅
and the unique path from T_i to T_j contains a vertex of T_k (possibly at the start or end).
A collection of pairwise disjoint subtrees having the property that none is between two
others is an asteroidal collection of subtrees.



LEMMA 1. If T_1, ..., T_n is an asteroidal collection of subtrees of a tree T, then T has
at least n leaves.

Proof: We may assume that the path from any leaf of T to the nearest branch point
contains a vertex of some T_j; otherwise, we could delete the vertices before the branch
point to reduce the number of leaves without changing the hypotheses. For each leaf v of
T, we assign to v the first subtree encountered on the path from v to its nearest branch
point. If T has fewer than n leaves, then some subtree T_k in our list is not assigned to any
leaf. Let x be a vertex of T_k, and let P be a maximal path containing x. The endpoints
of P are leaves of the tree, and T_k is between the subtrees assigned to those leaves. Hence
T must have at least n leaves. ■



LEMMA 2. If v, w have a common successor u that is not a successor of z in a digraph
D, then S_z is not between S_v and S_w in any subtree representation of D. Similarly, if
v, w have a common predecessor u that is not a predecessor of z in D, then T_z is not
between T_v and T_w in any subtree representation of D.

Proof: If S_z is between S_v and S_w, then S_v ∩ S_w = ∅, and T_u must contain the unique
path from S_v to S_w in the host. This contradicts S_z ∩ T_u = ∅, since S_z has a vertex on
this path. The proof of the other statement is similar. ■

Subtrees of a tree satisfy the Helly property; the members of a pairwise intersecting 
family of (sub)trees have a common vertex (see, for example, [6, p. 92]). 

LEMMA 3. If in a subtree representation of D the source subtrees are pairwise
intersecting and the sink subtrees are pairwise intersecting, then A(D) has a row of 1's or
a column of 0's, and similarly A(D) has a column of 1's or a row of 0's.
Proof: In such a representation, the source subtrees have a common vertex, and the sink
subtrees have a common vertex. Let s, t denote these vertices, respectively. If s = t, then
A(D) is all 1's and the claim holds. If s ≠ t, let x be the vertex of ∪ S_i that is closest to
t on the unique s,t-path in T. Suppose x ∈ S_k. If A(D) has no row of 1's, then x ≠ t
and some sink subtree T_j fails to contain x. However, t ∈ T_j, and hence T_j intersects no
source subtree, forcing a column of 0's in A(D). The other claim follows by considering
the vertex of ∪ T_j that is closest to s on the s,t-path in T. ■

Because permuting rows or columns is simply a relabeling of source or sink sets, leafage
can be viewed as a property of a 0,1-matrix (the adjacency matrix A(D)) rather than a
property of a digraph. We next show that asteroidal collections are forced by complements
of permutation matrices.

THEOREM 2. For n ≥ 3, let D_n be the digraph such that A(D_n) = J − I, where J
is the matrix of all 1's and I is the identity matrix. In every subtree representation
of D_n, either the source subtrees have a common vertex and the sink subtrees form
an asteroidal collection, or the sink subtrees have a common vertex and the source
subtrees form an asteroidal collection.
Proof: Let the vertices of D_n be {1, ..., n}; we have i → j if and only if i ≠ j. Consider a
subtree representation of D_n. By Lemma 3, the source subtrees and sink subtrees cannot
both be pairwise intersecting; we may assume by symmetry that there is a disjoint pair of
source subtrees.

When i, j, k are distinct vertices, we have i → k, j → k, and k ↛ k. Thus Lemma 2
implies that no source subtree can be between two other source subtrees. With
betweenness forbidden, S_i and S_k cannot intersect if S_i ∩ S_j = ∅. We conclude that if some pair
of source subtrees is disjoint, then the source subtrees are pairwise disjoint, and none is
between two others. Hence they form an asteroidal collection.

With the source subtrees pairwise disjoint, consider the sink subtrees. For any distinct
vertices i, j, k, we must have T_j containing the path from S_i to S_k and T_i containing the
path from S_j to S_k. Hence T_i ∩ T_j ≠ ∅, and the sink subtrees are pairwise intersecting.
The Helly property then implies that the sink subtrees have a common vertex. ■

Together, Lemma 1 and Theorem 2 imply that l(D_n) = n.

3. LEAFAGE AND DISJOINT FERRERS DIMENSION 

We next prove our main lower bound on leafage. We use N⁺_D(u) to denote the successor
set and N⁻_D(u) to denote the predecessor set of a vertex u in a digraph D.

THEOREM 3. If D is a digraph, then l(D) ≥ f*(D).

Proof: Suppose that l(D) = k, and let {(S_v, T_v): v ∈ V(D)} be a representation of D
in a host tree with k leaves. When k = 2, the result follows from the characterization
of interval digraphs in [15]. For k ≥ 3, we construct k pairwise disjoint Ferrers digraphs
whose union is D̄. With the host tree T embedded in the plane, let the leaves be x_1, ..., x_k
in clockwise order around the tree. Let P_i denote the path in T from x_i to x_{i+1}, indexed
cyclically.

For each leaf x_i of the host tree T, we construct an associated Ferrers digraph D(i).
The edges of D̄ consist of those pairs uv such that S_u ∩ T_v = ∅, meaning that the unique
shortest path from S_u to T_v has length at least 1. Let D(i) consist of those edges uv in D̄
such that the first edge on the path from S_u to T_v lies on P_i, with S_u between x_i and T_v
(see Fig. 1).

If S_u has no vertex on P_i, then u has no successors in D(i). If the last vertex of S_u'
on P_i is closer to x_{i+1} than the last vertex of S_u on P_i, then N⁺_{D(i)}(u') ⊆ N⁺_{D(i)}(u), by
construction. Hence the D(i)'s are Ferrers digraphs.

The paths P_i together cover each edge of the host tree exactly once in each direction.
Since each edge is covered in each direction, ∪_i D(i) = D̄. Since each edge is covered only
once, and when S_u ∩ T_v = ∅ there is a unique first edge on the path from S_u to T_v, the
subgraphs {D(i)} are pairwise disjoint. ■




Fig. 1. Ferrers digraphs from subtree representation 



This provides another proof that the leafage of the digraph D_n is n. Since each pair
of ones on the diagonal of A(D̄_n) induces a 2 by 2 permutation submatrix, no pair of them
can be covered by a single Ferrers digraph contained in D̄_n.

Although the inequalities f(D) ≤ f*(D) ≤ l(D) ≤ n(D) are best possible, with
equality throughout when D = D_n, the gaps can be arbitrarily large. For an interval digraph,
f(D) = f*(D) = l(D) = 2. By the characterization of interval digraphs in [15], f*(D) = 2
implies l(D) = 2. Nevertheless, there exist digraphs D with f*(D) = 3 and l(D) = n(D).



THEOREM 4. Leafage is not bounded by any function of f* when f* ≥ 3. In particular,
let E_n be the n-vertex digraph with

    A(E_n) = ( I    Y )
             ( Y^T  0 ),

where I denotes the n − 1 by n − 1 identity matrix and Y denotes a column vector of
n − 1 ones. If n ≥ 3, then l(E_n) = n, but f*(E_n) = f(E_n) = 3.

Proof: Because the last three rows and columns of A(E_n) form a row permutation of
A(D_3), we have f*(E_n) ≥ f(E_n) ≥ 3. For equality, partition the zeros of A(E_n) into three
sets: those in the upper right of the submatrix I, those in the lower left of the submatrix
I, and the zero in the lower right corner. These sets yield Ferrers digraphs, so f*(E_n) ≤ 3.

To show that l(E_n) = n, we name the vertices by the row and column indices of the
matrix and let {(S_i, T_i): 1 ≤ i ≤ n} be a subtree representation of E_n in the host tree
T. By Lemma 1, it suffices to show that the source subtrees or the sink subtrees form an
asteroidal collection in T.

We have S_n ∩ T_n = ∅; let P be the unique path from S_n to T_n in T. For each k < n,
we have S_k ∩ T_n ≠ ∅, S_n ∩ T_k ≠ ∅, and S_k ∩ T_k ≠ ∅. Consider also i < n. If P contains a
vertex x of S_i ∩ T_i, then the nonadjacency of i and k implies that x separates S_k and T_k.
This contradicts S_k ∩ T_k ≠ ∅, so S_i cannot intersect T_i in P. We conclude that S_i ∩ T_i is
contained in the component of T − E(P) containing T_n or in the component of T − E(P)
containing S_n. By symmetry, we may assume the latter. Since n → i, we now have P ⊆ T_i.
Applying this argument for all vertices other than n yields that all S_i ∩ T_i lie in the same
component of T − E(P), since there are no edges except loops among these vertices. Thus
P ⊆ T_i and P ∩ S_i = ∅ for all i < n.

Now consider disjointness and betweenness of the source subtrees. Since i → i, n → i,
and j ↛ i, Lemma 2 forbids S_j between S_i and S_n for i, j < n. Since P separates S_n from
the others, this implies that the source subtrees are pairwise disjoint. Furthermore, if S_j
is between S_i and S_k for i, j, k < n, then the union of the paths from S_n to the trees S_i
and S_k must intersect S_j, which puts S_j between S_n and one of {S_i, S_k}. Hence the source
subtrees are pairwise disjoint, and none is between two others. They form an asteroidal
collection, and Lemma 1 applies. ■

Every n by n (adjacency) matrix with leafage n is a minimal forbidden submatrix for
leafage less than n. We next present another such family. Given the adjacency matrix
A(D) of a digraph D, let H(D) be the graph with vertices corresponding to the zeros of
A(D) and edges corresponding to the pairs of zeros contained in a 2 by 2 permutation
submatrix. Cogis [2] and Doignon-Ducamp-Falmagne [4] proved that D has Ferrers dimension
at most 2 if and only if H(D) is bipartite; here we need only the obvious necessity of the
condition.
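
The graph H(D) and the bipartiteness test can be sketched in a few lines. This is only an illustration under stated assumptions: the matrix is a list of 0/1 rows, and the names `zero_graph` and `is_bipartite` are hypothetical. Two zeros at positions (i, j) and (k, l) are joined when the other two corners of their 2 by 2 submatrix are 1's.

```python
from collections import deque
from itertools import combinations


def zero_graph(A):
    """Adjacency lists of H(D): vertices are the zero positions of A."""
    zeros = [(i, j) for i, row in enumerate(A) for j, a in enumerate(row) if a == 0]
    adj = {z: set() for z in zeros}
    for (i, j), (k, l) in combinations(zeros, 2):
        # The two zeros lie in a 2 by 2 permutation submatrix of the complement.
        if i != k and j != l and A[i][l] == 1 and A[k][j] == 1:
            adj[(i, j)].add((k, l))
            adj[(k, l)].add((i, j))
    return adj


def is_bipartite(adj):
    """Breadth-first 2-coloring; returns False as soon as an odd cycle is detected."""
    color = {}
    for start in adj:
        if start in color:
            continue
        color[start] = 0
        queue = deque([start])
        while queue:
            x = queue.popleft()
            for y in adj[x]:
                if y not in color:
                    color[y] = 1 - color[x]
                    queue.append(y)
                elif color[y] == color[x]:
                    return False
    return True
```

For A = J − I with n = 3 the three diagonal zeros form a triangle in H, so the test fails, in agreement with f(D_3) = 3.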

THEOREM 5. Let C_n be the digraph consisting of a directed cycle of length n plus a
loop at each vertex. Then l(C_n) = n, but f(C_n) = f*(C_n) = 3.
Proof: Assume that the cycle is 1 → 2 → ··· → n → 1. Partition the zeros of A(C_n) into
three sets: those in the last row, those in the first n − 1 rows below the diagonal, and the
remainder. These sets form Ferrers digraphs, so f*(C_n) ≤ 3. To prove that f(C_n) > 2, we
observe that the positions

{(i, i + ⌈n/2⌉): 1 ≤ i ≤ ⌊n/2⌋} ∪ {(i, i + 1 − ⌈n/2⌉): ⌈n/2⌉ ≤ i ≤ n}

form an odd cycle in H(C_n).

We use induction on n to prove that l(C_n) = n. The claim holds for n = 3 because
A(C_3) is a permutation of A(D_3). For n > 3, let T be the host tree for an optimal
representation of C_n. Suppose first that S_{i−1} ∩ S_i ≠ ∅ for some i (all indexing is circular).
The subtree T_i must intersect both of these, so by the Helly property S_{i−1}, T_i, S_i have a
common vertex x in T. No other source subtree intersects T_i, and no other sink subtree
intersects S_{i−1} and S_i; hence no other assigned subtree contains x. Every two consecutive
subtrees in the list T_{i+1}, S_{i+1}, T_{i+2}, ..., S_{i−2}, T_{i−1} intersect; hence their union is connected
and contained in one component of T − x. The remaining components of T − x can be
deleted without changing the intersection digraph, so we may assume that x is a leaf.

Let P be the path in T from x to the nearest branch point. By symmetry, we may
assume that S_{i−1} contains as much of P as S_i. If S_i does not contain all of P, then T_{i+1}
intersects S_{i−1}, which is forbidden. Hence P ⊆ S_i ∩ S_{i−1}, and no sink subtree other than
T_i intersects P. If another source subtree extends onto P, then deleting its edges on P
does not change the intersection digraph. We can now delete T_i and replace S_{i−1}, S_i by a
single source subtree with edge set (E(S_{i−1}) ∪ E(S_i)) − E(P) to obtain a representation
of C_{n−1} with l(C_n) − 1 leaves. By the induction hypothesis, this yields l(C_n) ≥ n.

Hence we may assume that S_{i−1} ∩ S_i = ∅ for all i, and by symmetry also T_{i−1} ∩ T_i = ∅
for all i. In this case, let P_i be the portion of S_i that is the unique T_i, T_{i+1}-path, and let
Q_i be the portion of T_i that is the unique S_{i−1}, S_i-path. Note that Q_i ∩ P_i and P_i ∩ Q_{i+1} are
single vertices. The union of all these paths is thus a closed walk in which no consecutive
edges are the same. Such a walk contains a cycle, which is impossible in a host tree. Hence
this case does not arise. ■

We have presented examples with fixed f*(D) and large l(D). Also f*(D) may be
arbitrarily large when f(D) = 2. We construct a two-parameter family of adjacency
matrices. The matrix M_{k,m} is a km by km matrix consisting of k rows and k columns of m
by m blocks. The diagonal blocks are the identity matrix, the blocks below the diagonal
consist entirely of 0's, and the blocks above the diagonal consist entirely of 1's. The zeros
can be covered by two Ferrers digraphs, each consisting of all the subdiagonal blocks and
half of each diagonal block; hence f(M_{k,m}) = 2. We will prove that f*(M_{k,m}) = c + 1
when k = 1 + (c choose 2) and m is sufficiently large. (In this discussion we use the notation
M_{k,m} for both the digraph and its adjacency matrix.)
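
For experimentation, the block structure just described can be generated directly. The following is a minimal sketch; the function name `block_matrix` is hypothetical, and rows and columns are indexed from 0 rather than 1.

```python
def block_matrix(k, m):
    """Return the km by km matrix with identity diagonal blocks,
    all-ones blocks above the diagonal and all-zero blocks below it."""
    n = k * m
    M = [[0] * n for _ in range(n)]
    for r in range(n):
        for c in range(n):
            block_row, block_col = r // m, c // m
            if block_row < block_col:
                M[r][c] = 1          # block above the diagonal: all 1's
            elif block_row == block_col and r == c:
                M[r][c] = 1          # diagonal block: identity
    return M
```

With k = 2 and m = 2 this yields the rows 1011, 0111, 0010, 0001.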

Let I_n denote the n-vertex digraph whose adjacency matrix is the identity. A partition
of the complement Ī_n into c Ferrers digraphs can be viewed as a special c-coloring of the
0's in the n by n identity matrix I_n. We say that colors A, B are a crossed pair if A, B
appear together in some row and appear together in some column.

LEMMA 4. If n ≥ 3c!/2, then every partition of Ī_n into c Ferrers digraphs has a crossed
pair of colors.

Proof: The proof is by induction on c. For c = 2, a 2-coloring of the 0's in the 3 by 3
identity matrix cannot have all rows or all columns monochromatic without having a 2 by
2 permutation matrix with 0's in one color. For c > 2, let n = 3c!/2 and r = 3(c − 1)!/2.
Consider a partition of Ī_n into c Ferrers digraphs, and suppose that the corresponding
coloring has no crossed pair.

Since each row of the identity matrix has n − 1 0's, the pigeonhole principle implies
that each row has at least ⌈(n − 1)/c⌉ = ⌈3(c − 1)!/2 − 1/c⌉ = r 0's in some color. By symmetry,
we may assume there are 0's of color A in the first r columns of row r + 1 (see Fig. 2). Let
D be the subdigraph induced by the first r vertices, with K the corresponding submatrix.
By the induction hypothesis, every partition of D̄ into c − 1 Ferrers digraphs yields a
coloring of the 0's in K with a crossed pair of colors. Hence we may assume that all c
colors (including A) appear in K.

Let i be the index of a row in K in which color A appears. If another color appears
in row i of K, then it crosses A in the full matrix. Thus we may assume that row i of K
has only color A. Now, to avoid the forbidden submatrix in color A, position (i, r + 1) must
have some other color B. Now colors A and B appear in a row together, so they cannot
appear in a column together. This contradicts the observation that every color, including
B, appears in K. ■



Fig. 2. Coloring 0's in an identity matrix.

The bound 3c!/2 in Lemma 4 is not best possible. For c = 2, 3, 4, the bound is 3, 9, 36,
but the actual minimum values forcing the desired behavior are 3, 4, 6. We are content
with the bound arising from the short argument in Lemma 4 because our aim is to show
that f*(M_{k,m}) grows arbitrarily large.

THEOREM 6. If k ≥ 1 + (c choose 2) and m ≥ 3c!/2, then f*(M_{k,m}) > c.

Proof: Suppose the complement of M_{k,m} has a partition into c pairwise-disjoint Ferrers
digraphs. By Lemma 4, in each copy of I_m in the block structure of M_{k,m}, the
corresponding coloring has a crossed pair of colors. Since there are more than (c choose 2)
diagonal blocks, by the pigeonhole principle some pair of colors A, B is crossed twice.

Let r, s be the indices of the diagonal blocks where A, B are crossed, with r < s. Let
j be the column within diagonal block r where A, B both appear, and let i be the row
within diagonal block s where A, B both appear. Position (i, j) of block (s, r) is now forced to
have both color A and color B to avoid the forbidden substructure for the Ferrers digraphs
given by colors A and B. This is impossible. ■

It is worth noting that f*(M_{k,m}) ≤ c for all m when k ≤ (c choose 2). This is illustrated by
the block coloring in Fig. 3.

Fig. 3. A 5-coloring of the 0's in the blocks of M_{10,m}.

We previously gave examples with f(D) = 3, f*(D) = 3, and l(D) large. We next
prove that the family M_{k,m} includes examples with f(D) = 2, f*(D) = 3, and l(D) large.

THEOREM 7. For m ≥ 3, M_{2,m} is a 2m-vertex digraph with f(M_{2,m}) = 2, f*(M_{2,m}) =
3, and l(M_{2,m}) = m.

Proof: With four blocks of order m, M_{2,m} has block rows (I J) and (0 I). The value
f(M_{2,m}) = 2 was obtained before Lemma 4. The lower bound on f* comes from Theorem 6
(with c = 2), and the upper bound comes from the coloring illustrated in Fig. 3 (with c = 3).

To prove that l(M_{2,m}) ≤ m, we construct a representation with m leaves. Let the
host tree be the union of m paths q, s_i, t_i, with q as a common endpoint. Let S_i and T_{i+m}
be the entire ith path, for 1 ≤ i ≤ m. Let S_{i+m} be the single vertex s_i, and let T_i be the
single vertex t_i.
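
This representation can be verified mechanically for small m. The following is a minimal sketch, assuming 0-based indices (so vertex i + m of the paper is index i + m here with i starting at 0); the names `spider_representation` and `adjacency_from_representation`, and the tuple labels for host vertices, are hypothetical.

```python
def spider_representation(m):
    """Source and sink subtrees in the host tree made of m paths q, s_i, t_i sharing q."""
    q = "q"

    def leg(i):
        # Vertex set of the i-th path q, s_i, t_i.
        return {q, ("s", i), ("t", i)}

    sources, sinks = {}, {}
    for i in range(m):
        sources[i] = leg(i)           # S_i: the entire i-th path
        sinks[i + m] = leg(i)         # T_{i+m}: the entire i-th path
        sources[i + m] = {("s", i)}   # S_{i+m}: the single vertex s_i
        sinks[i] = {("t", i)}         # T_i: the single vertex t_i
    return sources, sinks


def adjacency_from_representation(sources, sinks):
    """Entry (u, v) is 1 exactly when S_u meets T_v."""
    n = len(sources)
    return [[int(bool(sources[u] & sinks[v])) for v in range(n)] for u in range(n)]
```

The resulting matrix has identity blocks on the diagonal, an all-ones block above and an all-zero block below, and the host tree has the m leaves t_i.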

It remains to prove that l(M_{2,m}) ≥ m. Consider a subtree representation with source
subtrees S_1, ..., S_{2m} and sink subtrees T_1, ..., T_{2m} for the vertices indexed by the rows
and columns of M_{2,m} in order. If T_{m+1}, ..., T_{2m} have no common point, then some T_i, T_j
among these are disjoint. Since T_i and T_j must intersect each of S_1, ..., S_m, those subtrees
contain the T_i, T_j-path and hence have a common point. Similarly, if S_1, ..., S_m have no
common point, then T_{m+1}, ..., T_{2m} must. By symmetry, we may assume that S_1, ..., S_m
have a common point q.

We now show that T_1, ..., T_m is an asteroidal collection of subtrees. If T_i ∩ T_j ≠ ∅
with i, j ≤ m, then the entire path from q to the closest vertex of T_i ∩ T_j belongs to at least
one of {S_i, S_j}, which contradicts the requirement that each of {S_i, S_j} intersects exactly
one of {T_i, T_j}. If T_i is between T_j and T_k, then let P be the path between T_j and T_k, and
let r be the vertex of P closest to q. Depending on the location of r relative to T_i on P,
the q, T_k-path in S_k or the q, T_j-path in S_j intersects T_i, contradicting their disjointness
from T_i. Thus T_1, ..., T_m is an asteroidal collection, and Lemma 1 implies that the host
has at least m leaves. ■



4. CATCH LEAFAGE 

If D has a subtree representation in which every sink subtree is a single vertex, then 
we say this is a catch representation, and D is a catch-tree digraph. In discussing catch 
representations, we say "sink point" instead of "sink subtree" to make the usage clear. If 
D has a catch-tree representation in which the host is a path, then D is a catch-interval 
digraph. The corresponding classes in which the source sets are single vertices are merely
those whose adjacency matrices are the transposes of the adjacency matrices of the digraphs
in the classes defined above. Catch-interval digraphs are characterized in [12] under the name
"interval catch digraphs" and in [15] under the name "interval-point digraphs".

The catch leafage l*(D) is the minimum number of leaves in a host tree in which D has
a catch-tree representation; the catch-interval digraphs are the digraphs with catch leafage
2. In the proof of Theorem 1, we gave every n-vertex digraph a catch representation in a
star with n leaves, so catch leafage is well-defined. Since every catch-tree representation is
a subtree representation, we have n ≥ l*(D) ≥ l(D).

We may make several simplifying assumptions about the form of optimal catch-tree
representations. In a catch-tree representation, sink subtrees can occupy the same vertex
of the host if and only if the corresponding columns of the matrix are identical. We may
split such a vertex of the host (without increasing the number of leaves), extending the
source subtrees to cover both. Thus we may assume that in catch representations each
vertex is occupied by at most one sink point. Also, if a vertex of degree at most two in the
host tree is not assigned as a sink point, then an edge incident to it can be contracted.

Recall that the predecessor set for v is N⁻(v) = {u: u → v}; this is the set whose
incidence vector is the column of the adjacency matrix corresponding to v. Because the
sink sets occupy single vertices, a catch-tree representation can be described by listing,
for each vertex of the host tree, the non-empty collection of source sets containing it. This
will be a catch-tree representation if and only if 1) among these sets appear the predecessor
sets, and 2) the set of vertices assigned to each source set forms a subtree of the host.

Therefore, our analysis of catch leafage focuses on the columns of the adjacency
matrix as incidence vectors for the predecessor sets. We define an associated partial order.
Let P(D), the incidence poset of the digraph D, be the collection of predecessor sets in
D, ordered by inclusion. For simplicity, we will use the same notation V_i to refer to a
predecessor set or the column of the adjacency matrix that is its incidence vector.

The width w(P) of a poset P is the maximum size of its antichains (collections of 
pairwise incomparable elements). Dilworth's Theorem [3] says that the elements of a finite 
poset P can be partitioned into w(P) disjoint chains. 

THEOREM 8. The inequality l*(D) ≤ w(P(D)) holds for every digraph D with
w(P(D)) ≥ 2.

Proof: Let k = w(P(D)), and let C_1, ..., C_k be a partition of P(D) into k disjoint chains.
Let the host tree T be a subdivision of a star with k leaves. That is, T consists of a
central point of degree k from which k paths emerge. Assign the central vertex the set
of all predecessors, and assign to each emerging path the sets on one of the chains C_i, in
decreasing order. The predecessor sets all appear at vertices, and the occurrences of each
predecessor form a subtree, so this is a catch-tree representation. ■
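
The construction in this proof can be sketched as follows, assuming a chain partition of the predecessor sets is already available as input. The name `catch_star_representation` and the host-vertex labels are hypothetical, vertices are numbered from 0, and the center is labelled with all of V(D), which is one reading of "the set of all predecessors".

```python
def catch_star_representation(A, chains):
    """Catch representation in a subdivided star with one path per chain.

    A      -- adjacency matrix as a list of 0/1 rows
    chains -- list of chains; each chain is a list of frozensets (predecessor sets)
    Returns (sources, sinks): sources[u] is a set of host vertices, sinks[v] one host vertex.
    """
    n = len(A)
    center = ("center",)
    labels = {center: frozenset(range(n))}          # assumed label of the central vertex
    for c, chain in enumerate(chains):
        # Walk outward from the center in decreasing order of inclusion.
        for depth, pred_set in enumerate(sorted(chain, key=len, reverse=True)):
            labels[(c, depth)] = pred_set
    # The source subtree of u consists of the host vertices whose label contains u.
    sources = {u: {x for x, lab in labels.items() if u in lab} for u in range(n)}
    # The sink point of v is the host vertex labelled with the predecessor set of v.
    where = {lab: x for x, lab in labels.items() if x != center}
    sinks = {v: where[frozenset(u for u in range(n) if A[u][v])] for v in range(n)}
    return sources, sinks
```

The host tree has one leaf per chain, so a minimum chain partition gives w(P(D)) leaves. Each source set is closed toward the center because the labels decrease along every path.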

Fulkerson [5] observed that Dilworth's Theorem is equivalent to the König-Egerváry
Theorem on matchings in bipartite graphs. Thus bipartite matching or other algorithms
can be used to compute w(P(D)). Nevertheless, this is only a bound on l*(D), and this
bound also can be arbitrarily bad. The digraph D consisting of a directed path plus a loop
at each vertex has catch leafage 2 but w(P(D)) = n − 1, so w(P(D)) is not bounded by
any function of l*(D).
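
One way to carry out the matching computation is sketched below. It is an illustration, not the method of any cited paper: it uses the standard reduction in which a minimum chain partition has size equal to the number of elements minus a maximum matching in the bipartite graph of strict inclusions, together with a simple augmenting-path matching. The name `poset_width` is hypothetical.

```python
def poset_width(A):
    """Width of the inclusion poset on the distinct predecessor sets (columns of A)."""
    n_rows, n_cols = len(A), len(A[0])
    elems = list({frozenset(i for i in range(n_rows) if A[i][j]) for j in range(n_cols)})
    # above[a] lists the sets that strictly contain a; strict inclusion is transitive.
    above = {a: [b for b in elems if a < b] for a in elems}
    match = {}  # maps a larger set b to the smaller set currently matched to it

    def augment(a, seen):
        # Try to match a, rerouting earlier matches along an augmenting path.
        for b in above[a]:
            if b in seen:
                continue
            seen.add(b)
            if b not in match or augment(match[b], seen):
                match[b] = a
                return True
        return False

    matched = sum(1 for a in elems if augment(a, set()))
    # Each matched pair merges two chains, so the minimum chain partition
    # (equal to the width, by Dilworth's Theorem) has this many chains.
    return len(elems) - matched
```

For the directed path with loops on five vertices the function returns 4, matching the value n − 1 stated above.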

Note that w(P(D)) = 1 when D is a Ferrers digraph. Thus Theorem 8 requires
w(P(D)) ≥ 2, and we see that the break between w(P(D)) and n(D) can be large.

We now have the chain of inequalities

f(D) ≤ f*(D) ≤ l(D) ≤ l*(D) ≤ w(P(D)) ≤ n(D).

One may have equality throughout (achieved by D_n). To prove that there can be
arbitrarily bad breaks between any pair, it suffices to produce examples where l(D) is bounded
and l*(D) is large. To do this, we prove a sufficient condition for l*(D) = w(P(D)).

THEOREM 9. If D is a digraph such that P(D) has a unique maximal element and
w(P(D)) ≥ 2, then l*(D) = w(P(D)).

Proof: Let V_0 be the unique maximal element, and let A = {V_1, ..., V_k} denote a maximum
antichain in P(D). Let q_i denote the vertex of the host assigned to V_i in an optimal
catch representation. Iteratively delete leaves of the host tree that are not in {q_i} until
all remaining leaves belong to {q_i}. If the number of leaves (other than q_0) is less than
k, then some set V_i in A is assigned to a non-leaf q_i. Every path from q_0 to another
remaining vertex can be extended to reach a remaining leaf. In particular, the path from
q_0 to q_i belongs to a path from q_0 to a leaf assigned q_j. Since V_j ⊆ V_0 and each predecessor
is assigned to the vertices of a tree, this entire path including q_i belongs to the source
subtrees for V_j. This yields V_j ⊆ V_i, contradicting the choice of A as an antichain. ■

THEOREM 10. Catch leafage is not bounded by any function of leafage. If F_n denotes
the n-vertex digraph whose adjacency matrix is

    A(F_n) = ( I    Y )
             ( Y^T  1 ),

where I denotes the n − 1 by n − 1 identity matrix and Y denotes a column vector of
n − 1 ones, then f*(F_n) = l(F_n) = 2, but l*(F_n) = n − 1.
Proof: The upper right and lower left zeros in the portion I of the adjacency matrix yield
two disjoint Ferrers digraphs whose union is the complement of F_n. As proved in [15], this
is equivalent to leafage 2. On the other hand, the predecessor set of the last vertex contains
all the other predecessor sets, so l*(F_n) = w(P(F_n)) = n − 1. ■

The sufficient condition in Theorem 9 does not characterize equality in l*(D) ≤
w(P(D)). For the digraph C_n consisting of a directed cycle plus loops, we have seen
that l(C_n) = n. Also the columns of A(C_n) form an antichain, so l(C_n) = l*(C_n) =
w(P(C_n)) = n.

This example shows also that leafage and catch leafage can drop arbitrarily much 
when a single vertex is deleted. Deleting one vertex from a cycle with loops leaves a path 
with loops. The former has leafage and catch leafage n; the latter has leafage and catch 
leafage 2. 

Our proof of l*(D) ≤ w(P(D)) shows that every digraph has a catch representation
in a host tree having only one branch point, and if P(D) has a unique maximum this can
be achieved in a host tree with the minimum number of leaves. This is not true of all
digraphs. The digraph D with adjacency matrix below contains C_4 and thus has catch
leafage at least 4. However, every catch representation of D in a host tree with four leaves
has two branch points. We thus close by mentioning two further optimization problems for
digraphs with catch leafage k: Among catch representations in trees with k leaves, what
is the minimum number of branch points, and what is the minimum number of vertices?

/I 1 0\ 
110 1 
110 1 
111 

\1 1 1 1 1/